- random AVs during training on many letters, caused by incorrect handling of the training splash indicator.
NOTE: the number displayed on the indicator doesn't mean anything - it simply shows that "something is happening" and can't be used to calculate the actual number of trained letters. The splash itself can vanish for a few moments during training, and the number can be reset to zero (especially after every 1000 trained letters, since the filter flushes its internal buffers).
NOTE: to fix possible errors caused by the recent bugs, if you are updating from a version earlier than 0.7.0 rc3, please repair your bases using the NEW "lstrepair" utility (http://www.ritlabs.com/download/bayesit/lstrepair.rar) - just exit The Bat!, place the utility into the bayesit\base folder and run it!
- hangs or errors when marking spam at the beginning of training.
- hangs on startup on some machines because of an incorrect .lng parser (presumed cause).
- some minor bugs in the train-base thread;
* train-base thread optimized;
+ new option in the "advanced.ini" file: "use average instead of bayesian" - if set to "1", it causes bayesit to calculate a simple arithmetic average of the tokens' weights instead of using the bayesian formula. This gives a smooth, linear spamminess result. To switch the option on, just exit The Bat!, open the advanced.ini file and add the line: use average instead of bayesian="1" (1 = on, 0 = off).
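The difference between the two modes can be sketched as follows. This is only an illustration: the changelog does not say which bayesian formula the filter uses, so the Graham-style combination below is an assumption, and the function names are hypothetical. Weights are assumed to lie strictly between 0 and 1.

```python
from math import prod


def combine_bayesian(weights):
    """Graham-style naive Bayes combination (an assumed formula, not taken from BayesIt)."""
    spam = prod(weights)
    ham = prod(1.0 - w for w in weights)
    return spam / (spam + ham)


def combine_average(weights):
    """Plain arithmetic mean of the token weights."""
    return sum(weights) / len(weights)


# Hypothetical token weights for one letter.
weights = [0.9, 0.8, 0.6]
bayesian_score = combine_bayesian(weights)
average_score = combine_average(weights)
```

With these weights the product-based combination is pushed much closer to 1 than the mean, which illustrates why the average gives a smoother, more linear result.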
NOTE: to fix possible errors caused by the recent bugs, if you are updating from a version earlier than 0.7.0 rc2, please repair your bases using the "lstrepair" utility (http://www.ritlabs.com/download/bayesit/lstrepair.rar) - just exit The Bat!, place the utility into the bayesit\base folder and run it!
- due to a bug in the btree* implementation, the message index (.idx) could sometimes (about 1 message in 15000) be truncated and the message base (.lst) could be corrupted.
- hangs on an unsuccessful attempt to delete temporary files.
NOTE: to fix possible errors caused by the first bug in the list, this distribution also includes the "lstrepair" repair utility. Just exit The Bat!, place the utility into the bayesit\base folder and run it!
* more correct handling of versioninfo to show the filter's current version.
- when receiving or marking several letters, only one of them was actually added to the database, which caused an unexpected imbalance of the base and wrong grades given to letters.
- the identifier of the last letter in the base was duplicated on every background recalculation.
NOTE: since this bug is fixed, it is recommended to re-mark as junk/not junk all letters received in the last 2-3 weeks.
- a minor bug that caused "not contains" rules to be saved as "contains";
+ support for LNG files with embedded versioninfo (like The Bat!)
+ support for the PCRE which is embedded into thebat.exe - this version doesn't need PCRE.dll anymore! (however, if an external PCRE.dll is placed in the same folder as thebat.exe, it will be loaded instead of the internal PCRE)
Version 0.6.9 (not for a "clean" installation - just a .tbp to replace the one from a previous version (0.6.5, for example))
- major bug in the Btree+ structure (used as the core for determining a letter's uniqueness via the .idx files) which caused unstable operation in the release build.
*** if you used the previous (0.6.3) version, it is recommended to open the folder "transact" (located in the "base" folder in the filter's working folder), open the file "last.txt" and change the value inside to "0".
*** if you used a previous version (0.6.3 or below), it is recommended to delete all .idx files in the "base" folder in the filter's working folder.
- a small correction for a problem that caused a heap of bugs, since the MS VC runtime library is NOT fully compliant with the ANSI C++ standard. Please review all the latest bugs in bugtrack.
- The Bat! getting stuck in the middle of receiving letters...
* this build includes debugging info which can help to localize errors more quickly and accurately.
* some hooks for exception handling - if an error arises, a message box is shown displaying the code and location of the exception, and the exception is also written to bayesit.log
* some speed-critical functions were rewritten in pure assembler.
- [ID 3460] Initial White List & Black List Rules cannot be saved
- error in processing of "header" filters
+ visual debugger for regular expressions (available while editing a rule with the "match" or "not match" condition via the "regex" button; see also the "tree" button in the debugger's window)
- the "global events" log option didn't affect the logging of white/black/ignore list global reloads
- the "headers" target in b/w rules wasn't saved [ID 3413].
- the error on Win9X when trying to save rules: "This function is not supported on this system".
- it was impossible to use regular expressions in a language other than English (i.e. outside the "C" locale) (PCRE.dll bug).
- the "Selective.txt" file was improved to better fit the current bayesian principles.
- Some internal bugs which caused spontaneous crashes from time to time. [ID 3286]?
+ Small regular expression debugger, available by selecting "matches" or "don't matches" in the "actions" of the b/w rules. (Hint: if you check "highlight matches", the whole matched pattern is highlighted in blue in the input text, and any selected part of the pattern is highlighted in red).
- [b/w] the filter didn't report anything when it was impossible to save the b/w list for any reason.
- [b/w] the list of rules didn't change after modifying a rule using the "edit" button.
- [b/w] regexp rules weren't available for localized (for example, Russian) signal strings because PCRE used the default locale.
- when you changed the interface language, all menus were still shown in the previous language.
WARNING! Since the new The Bat! CP-API is in the "pre-alpha" state, after you change the Bayesit! interface language you may see an AV message when you point the cursor at the "?" or "Properties" main menu. However, usually it is only one message, and it does not cause a crash (at least under Windows XP).
- [b/w] if you edited a B/W rule which uses "match" or "not match" conditions while PCRE.DLL was absent, the filter didn't show the right signal string.
- [b/w] the changed bayes grade range (1..99 instead of 0..100) wasn't reflected in the log.
- [b/w] b/w reporting wasn't affected by the "global filter's state" log option.
* [b/w] if you just hit "Add rule" and pressed "OK", the filter created a rule with a blank signal string, which matches _all_ incoming letters. A "notify" yes-no dialog was added to help you handle this situation (so, nothing is wrong if you just want to "switch off" the filter by adding an empty rule to the ignore-list, but the filter will ask you about it to avoid a silly situation).
+ [b/w] the icons for the rules in the b/w options.
- a small bug with rounding of the values in the statistics
- it was impossible to work with PCRE versions other than 4.4
- "Contains" modifier in the black/white rules wasn't recognized
* if a letter matches a blacklist rule, the result will always be 100, and if a letter matches a whitelist rule, the result will always be 0. For the original bayesian method the range is set to be between 1 and 99.
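The grading rule above can be sketched like this. The function name, the list_match parameter and the rounding of the bayesian score are hypothetical and are not taken from the filter itself.

```python
def final_grade(bayes_score, list_match=None):
    """Return the grade reported for a letter.

    list_match is "black", "white" or None (hypothetical parameter).
    bayes_score is the raw bayesian result on a 0..100 scale.
    """
    if list_match == "black":
        return 100  # blacklist hit: always 100
    if list_match == "white":
        return 0    # whitelist hit: always 0
    # Pure bayesian result is kept inside 1..99, so 0 and 100
    # can only come from a list match.
    return max(1, min(99, round(bayes_score)))
```

This way a grade of exactly 0 or 100 tells you that a white/black rule fired, not the bayesian part.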
+ added white/black rules editor - it is available either in the filter's preferences or from the "Preferences" item in the main menu of The Bat!
+ added ignore list functionality - if a letter matches an ignore-list rule, it will not be autolearned by the bayes base at all and the filter will return the "wasn't processed" result to The Bat!
- the filter always used "enhanced judging" and didn't check the corresponding option flag
+ black/white list filtering (with the same syntax as the K9 filter uses) (see http://keir.net/k9_lists.html for details)
Note that b/w filtering may need the PCRE.dll library (if you use the "match" keyword in the lists). This library is included in the distribution, and you can also find it at www.pcre.org. Place this library in the same directory as TheBat.exe.
* the indexes are now stored in a different form - all *.idx files in use will become about 25-30% smaller (all older indexes will be recreated automatically)
+ (only for The Bat! 2.12 and newer) the item "About BayesIt!" was added to the main menu of The Bat! (in the "?" section, before The Bat!'s "About" item). It was added for testing purposes - to test the new CP-API which is now under development.
- dialog "bayesit is performing training" wasn't localized.
- messages about successful initial settings weren't localizable.
* if the .lng file contains an alphabet and translit table, they are used by default (previously it was necessary to switch this on in the filter's settings).
* default settings are slightly changed (logging is now enabled by default).
* it is possible to change the size of controls in dialogs through .lng file localization. To do that, a suffix with strict syntax must be added at the very end of the string with the control's translation: "|XXXXXXXX" (a vertical line immediately followed by 8 hexadecimal digits). All digits must be capital (i.e. "ABCDEF", not "abcdef"). These digits are treated as 4 signed bytes: x, y, cx, cy. These bytes are added to the coordinates of the localizable control. For example, to move a control left by 16 units and make it wider by 32 units, add something like "|F0002000" at the end of the translation of that control (in hexadecimal, F0 is -16 and 20 is 32).
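A minimal sketch of how such a suffix could be decoded, assuming the four bytes are ordinary two's-complement signed bytes in the order x, y, cx, cy. The function name and the sample string "Cancel" are hypothetical. Under this reading the hex pair F0 is -16 and 20 is 32, so a shift of exactly -10 and a widening of exactly 20 would be written F6 and 14.

```python
import re

# Vertical line followed by exactly 8 capital hexadecimal digits at the end.
SUFFIX = re.compile(r"\|([0-9A-F]{8})$")


def split_geometry_suffix(text):
    """Split a translated string into (caption, (x, y, cx, cy))."""
    match = SUFFIX.search(text)
    if match is None:
        # No valid suffix: the control keeps its original geometry.
        return text, (0, 0, 0, 0)
    raw = bytes.fromhex(match.group(1))
    # Reinterpret each unsigned byte (0..255) as a signed byte (-128..127).
    deltas = tuple(b - 256 if b >= 128 else b for b in raw)
    return text[:match.start()], deltas


caption, deltas = split_geometry_suffix("Cancel|F0002000")
```

Lower-case digits do not match the pattern, so a string such as "Cancel|f0002000" would be left untouched in this sketch.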
- when installing from The Bat! preferences menu it was necessary to configure the filter manually. Now it applies all default settings itself.
- The "can't create folder" window was shown in some false cases.
+ The filter detects its interface language during installation and tries to load the appropriate language from the file bayesit.lng. The file is looked for in the filter's working folder, the filter's program folder, The Bat!'s program folder and finally in the main The Bat! mail folder.
* some strings in the English interface were changed and the official source files for translators are now committed to CVS. Together with the previous feature, this makes BayesIt easy to localize.
- the version number returned by the "Bayesitversion" macro was cut off.
* some improvements for the "default" run (when the filter finds no information about its previous installation/bases).
! please note that there were some problems with the "first run" when installing with The Bat! 2.10.00. Since these problems were caused by The Bat! installer, you can fix them only by downloading The Bat! 2.10.01 or newer. As a workaround for 2.10.00, just say "no" when The Bat! installer prompts you about installing bayesit, and then install bayesit manually using The Bat! configuration-plugins menu. If you have already installed the plugin, you can reset its settings by deleting the file "advanced.ini" in the plugin's working (not program!) folder, and by deleting the stored bayesit settings located in the file tbplugins.ini (in the main mail folder of The Bat!), in the "plugins data" section. Just find the line which corresponds to bayesit (by number) and delete everything from "=" to the end of the paragraph.
Version 0.5
+ statistics available via macros.
+ "local user alphabet" and "transliteration table" are placed into localizable resources
and can be predefined by the translator - so, it may be much easier for end users to set up
the plugin
+ full options support (automatic migration from previous versions of BayesIt).
+ advanced settings are stored in external file advanced.ini
+ the filter's version and current data about the bases are written to the log - it is inconvenient to look at
the "information" window when a report to Support is needed. Now everything is in the log.
+ all resources (including strings for "information" window) have been made localizable.
+ implemented enhanced judging. Here is what it means: if a letter contains more words with equal maximum
"interestingness" than the assigned number of judging tokens can hold, the filter will not cut this subset to the assigned
number, but will use all such tokens. This helps to catch spam which uses a mass of neutral words
after the actual spam message to "overload" the filter's grade with "good" words.
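The selection rule can be sketched as below. This is an illustration under assumptions: "interesting" is taken to mean the distance of a token's weight from the neutral value 0.5 (as in Graham-style filters), and the function name, the tolerance and the sample weights are hypothetical.

```python
def select_judging_tokens(weights, limit, enhanced=True):
    """Pick the token weights used for judging a letter."""
    def interest(w):
        # Assumed measure: distance from the neutral weight 0.5.
        return abs(w - 0.5)

    ranked = sorted(weights, key=interest, reverse=True)
    if enhanced and ranked:
        top = interest(ranked[0])
        # All tokens tied (within a small tolerance) for maximum interest.
        tied = [w for w in ranked if abs(interest(w) - top) < 1e-9]
        if len(tied) > limit:
            # More equally interesting tokens than the limit can hold:
            # keep all of them instead of cutting the subset.
            return tied
    return ranked[:limit]


# 4 strongly spammy and 6 strongly "good" tokens, all equally interesting.
letter = [0.99] * 4 + [0.01] * 6 + [0.6]
with_enhanced = select_judging_tokens(letter, limit=5, enhanced=True)
without_enhanced = select_judging_tokens(letter, limit=5, enhanced=False)
```

Without the enhancement only 5 of the 10 tied tokens are judged, and which ones survive depends on their order in the letter. With it, all 10 take part.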
* optimized base recalculation. It can be tuned by editing bayesit.ini, parameter "recalculation
strategy". If it is an integer of 1 or more, recalculation is performed after the filter collects
more than this number of letters. If it is a float value below 1 (such as 0.001), recalculation is performed
after the filter collects more letters than the total number of letters
already in the base multiplied by this number (for example, if you have 1000 spams and 2000 hams and
this parameter is set to 0.01, recalculation will happen when you collect (1000+2000)*0.01 = 30 letters).
In this case, however, recalculation is performed anyway once the number of collected letters exceeds 100.
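The threshold rule for "recalculation strategy" can be sketched as follows. The function names are hypothetical, and the strict "more than" comparison follows the wording of the rule rather than any knowledge of the filter's code.

```python
def recalculation_threshold(strategy, letters_in_base, cap=100):
    """Number of collected letters that must be exceeded to recalculate."""
    if strategy >= 1:
        # Integer mode: a fixed number of letters.
        return int(strategy)
    # Fractional mode: a share of the base, never above the cap of 100.
    return min(letters_in_base * strategy, cap)


def should_recalculate(collected, strategy, letters_in_base):
    return collected > recalculation_threshold(strategy, letters_in_base)


# 1000 spams + 2000 hams with strategy 0.01 gives a threshold of about 30.
small_base = should_recalculate(31, 0.01, 1000 + 2000)
# A very large base would give 10000, but the cap of 100 applies.
large_base = should_recalculate(101, 0.01, 1_000_000)
```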
* statistics are saved every hour (if changed).
- if a misclassified letter was re-marked the opposite way shortly after being received (usually immediately),
it wasn't recalculated but was reported as a "training dupe".
- there was a heavy load (almost like a deadlock) when parsing huge uuencoded attachments;